{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "ec47c906-9d22-41bb-936c-481708f468dc",
   "metadata": {},
   "source": [
    "# Hybrid Fusion Retriever Pack\n",
    "\n",
    "This LlamaPack provides an example of our hybrid fusion retriever pack."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "6fc7f92e-f69d-40fd-b6f6-14497e37e844",
   "metadata": {},
   "outputs": [],
   "source": [
    "!pip install llama-index llama-hub rank-bm25"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "id": "f80be184-d951-4b34-a510-d7ab979c257d",
   "metadata": {},
   "outputs": [],
   "source": [
    "import nest_asyncio\n",
    "\n",
    "nest_asyncio.apply()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d3f75749-2091-4951-948b-b5e700049316",
   "metadata": {},
   "source": [
    "### Setup Data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "84ff1173-f49a-4d6d-94b0-d867c982627c",
   "metadata": {},
   "outputs": [],
   "source": [
    "!wget \"https://www.dropbox.com/s/f6bmb19xdg0xedm/paul_graham_essay.txt?dl=1\" -O paul_graham_essay.txt"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "id": "404a4b3f-736a-42ab-81f0-cbf4b04ca463",
   "metadata": {},
   "outputs": [],
   "source": [
    "from llama_index import SimpleDirectoryReader\n",
    "from llama_index.node_parser import SimpleNodeParser\n",
    "\n",
    "# load in some sample data\n",
    "reader = SimpleDirectoryReader(input_files=[\"paul_graham_essay.txt\"])\n",
    "documents = reader.load_data()\n",
    "\n",
    "# parse nodes\n",
    "node_parser = SimpleNodeParser.from_defaults()\n",
    "nodes = node_parser.get_nodes_from_documents(documents)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "646d9629-869f-492c-9a2c-512e27203cd0",
   "metadata": {},
   "source": [
    "### Download and Initialize Pack"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "id": "7f6c49eb-c173-45a6-9ea6-ed9f21452609",
   "metadata": {},
   "outputs": [],
   "source": [
    "from llama_index.llama_pack import download_llama_pack\n",
    "\n",
    "HybridFusionRetrieverPack = download_llama_pack(\n",
    "    \"HybridFusionRetrieverPack\",\n",
    "    \"./hybrid_fusion_pack\",\n",
    "    # leave the below commented out (was for testing purposes)\n",
    "    # llama_hub_url=\"https://raw.githubusercontent.com/run-llama/llama-hub/jerry/add_llama_packs/llama_hub\",\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "id": "353ce36f-bc63-4d01-be17-bdcc4bbf1651",
   "metadata": {},
   "outputs": [],
   "source": [
    "hybrid_fusion_pack = HybridFusionRetrieverPack(\n",
    "    nodes, chunk_size=256, vector_similarity_top_k=2, bm25_similarity_top_k=2\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "196eefd8-9baa-4d9c-8962-c12abb5f4bfd",
   "metadata": {},
   "source": [
    "### Run Pack"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "id": "8529d01b-807b-4009-b6e9-0f709382abcd",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Generated queries:\n",
      "1. What is YC and how is it related to the author?\n",
      "2. What are some notable achievements or projects of the author during his time in YC?\n",
      "3. Can you provide a timeline or overview of the author's activities and contributions while in YC?\n"
     ]
    }
   ],
   "source": [
    "# this will run the full pack\n",
    "response = hybrid_fusion_pack.run(\"What did the author do during his time in YC?\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "id": "c2530326-e79e-4448-b966-d95e9a93802f",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "During his time in YC, the author worked on various tasks related to running the program. He mentions that every 6 months, a new batch of startups would come in, and their problems became the author's problems. He found this work engaging and learned a lot about startups. He also mentions that there were parts of the job he didn't like, such as disputes between cofounders and dealing with people who maltreated the startups. However, he worked hard even at the parts he didn't like because he wanted YC to be successful.\n"
     ]
    }
   ],
   "source": [
    "print(str(response))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "id": "6556c1bd-c345-4d26-ae47-b1c78e82daa9",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "2"
      ]
     },
     "execution_count": 16,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "len(response.source_nodes)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c20f770e-ac42-4e7f-ae1a-e622731a02ad",
   "metadata": {},
   "source": [
    "### Inspect Modules"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "id": "9dd355aa-b04f-4c1d-9287-1ed93df46698",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{'vector_retriever': <llama_index.indices.vector_store.retrievers.retriever.VectorIndexRetriever at 0x172882a70>,\n",
       " 'bm25_retriever': <llama_index.retrievers.bm25_retriever.BM25Retriever at 0x17288f400>,\n",
       " 'fusion_retriever': <llama_index.retrievers.fusion_retriever.QueryFusionRetriever at 0x17288f490>,\n",
       " 'query_engine': <llama_index.query_engine.retriever_query_engine.RetrieverQueryEngine at 0x172afdf60>}"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "modules = hybrid_fusion_pack.get_modules()\n",
    "display(modules)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e4883f8a-b466-4135-addd-06ed8bc98123",
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "llama_hub",
   "language": "python",
   "name": "llama_hub"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.8"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
